
[DEMO] Compiler Optimization#3169

Open
DBooots wants to merge 56 commits into KSP-KOS:develop from DBooots:compiler_optimizations

Conversation


DBooots commented Mar 14, 2026

Demo only at this time

This PR adds an optimization stage to the compiler. This stage converts the Opcodes into a three address code interim representation, upon which various optimizing passes are performed. The interim representation is then converted back into opcodes and emitted to the program context as normal.
The optimizing passes currently implemented trim opcodes without changing any control flow, except to remove branches that are shown to be inaccessible.

Currently Implemented Passes

Suffix Replacement
Replaces CONSTANT: fields with the constant value, and replaces aliased SHIP: fields with the direct alias. This saves 1 opcode per ship alias replaced, and allows constant folding of the CONSTANT values.
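To illustrate the idea (not the actual implementation, which operates on kRISC Opcode objects in C#), here is a rough Python sketch of the suffix-replacement pass. The tuple-based opcode names (`push`, `getmember`) are made up for the example:

```python
import math

# Hypothetical stand-ins for kOS's CONSTANT: suffixes.
KNOWN_CONSTANTS = {"pi": math.pi, "e": math.e}

def replace_constant_suffixes(ops):
    """Collapse a (push CONSTANT; getmember <name>) pair into a single
    push of the literal value, saving one opcode per access."""
    out, i = [], 0
    while i < len(ops):
        if (i + 1 < len(ops)
                and ops[i] == ("push", "CONSTANT")
                and ops[i + 1][0] == "getmember"
                and ops[i + 1][1] in KNOWN_CONSTANTS):
            out.append(("push", KNOWN_CONSTANTS[ops[i + 1][1]]))
            i += 2  # two opcodes collapse into one
        else:
            out.append(ops[i])
            i += 1
    return out
```

Because the suffix access becomes a literal, the constant folding pass below can then fold it into surrounding arithmetic.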
Constant Propagation
Where a variable is assigned a constant, accesses of that variable are replaced with the constant (taking care around global variables and triggers, which can change variable values at run time). This allows constant folding of the assigned values.
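A minimal sketch of the propagation idea over a three-address-code list (assumed shape: `(dest, op, arg1, arg2)` tuples; the real pass also tracks globals, triggers, and reaching definitions, none of which is modeled here):

```python
def propagate_constants(instrs):
    """Replace variable reads with known constant values.
    A destination stops being 'known' as soon as it is reassigned
    to anything other than a literal constant."""
    known, out = {}, []
    for dest, op, a, b in instrs:
        a = known.get(a, a)  # substitute known constants for reads
        b = known.get(b, b)
        out.append((dest, op, a, b))
        if op == "const" and isinstance(a, (int, float)):
            known[dest] = a
        else:
            known.pop(dest, None)  # dest is no longer a known constant
    return out
```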
Constant Folding
Performs arithmetic, including scalar math functions, to reduce formulas to their simplest form. This includes the CONSTANT: fields from Suffix Replacement and the propagated constant variables from Constant Propagation. This step saves between 1 and 3 opcodes for every folding optimization that is performed.
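As a rough sketch of the folding step (again over hypothetical three-address tuples, not the PR's actual IR), an op whose operands are all literals is evaluated at compile time and rewritten as a constant:

```python
import math

# A small sample of foldable operations; the real pass covers the
# scalar math functions as well.
FOLDERS = {"add": lambda a, b: a + b,
           "mul": lambda a, b: a * b,
           "sqrt": lambda a, _: math.sqrt(a)}

def fold_constants(instrs):
    """Evaluate ops whose operands are all literals, rewriting them
    as 'const' instructions so later uses can fold too."""
    known, out = {}, []
    for dest, op, a, b in instrs:
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "const" and isinstance(a, (int, float)):
            known[dest] = a
            out.append((dest, op, a, b))
        elif op in FOLDERS and isinstance(a, (int, float)) and (
                b is None or isinstance(b, (int, float))):
            val = FOLDERS[op](a, b)      # do the arithmetic at compile time
            known[dest] = val
            out.append((dest, "const", val, None))
        else:
            known.pop(dest, None)
            out.append((dest, op, a, b))
    return out
```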
Dead Code Elimination
Eliminates any branches where the condition is known to be false after propagating and folding constants.
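The elimination itself reduces to a simple rewrite once propagation and folding have made a branch condition constant. A hedged sketch (opcode names hypothetical; the follow-up sweep that drops the now-unreachable blocks is not shown):

```python
def eliminate_dead_branches(instrs):
    """Rewrite 'branch_false <const> <label>' when the outcome is known:
    a constant-true condition means the branch is never taken (drop it);
    a constant-false condition means it is always taken (plain jump)."""
    out = []
    for ins in instrs:
        if ins[0] == "branch_false" and isinstance(ins[1], bool):
            if ins[1]:
                continue                      # never taken: fall through
            out.append(("jump", ins[2]))      # always taken
        else:
            out.append(ins)
    return out
```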
Peephole Optimizations
This pass does a number of small optimizations:

  • Replaces string indexing against a lexicon with suffix accessing, when the string is a valid identifier. This saves 1 opcode for every index set or get.
  • Replaces parameterless suffix calls with a direct suffix access (since the suffix get opcode checks if it's a method and invokes it for free). This saves 2 opcodes for each suffix method call, but feels like cheating.
  • Replaces calls to VectorDotProduct() with a simple multiplication opcode. This saves 2 opcodes for every VectorDotProduct() that is performed.
  • Replaces double negation or double logical not operations with the original value. This probably doesn't come up often, but it saves 2 opcodes every time.
  • Replaces not->branch[true|false] with branch[false|true]. This saves 1 opcode per branch simplified.
  • Performs algebraic simplification (not necessarily of constants) to reduce the number of opcodes, e.g. A*B+A*C = A*(B+C), X*X*X = X^3, or -A+B = B-A. This saves multiple opcodes wherever it applies.
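Two of the peephole rewrites above (double-not removal and not->branch inversion) can be sketched as a single left-to-right pass over hypothetical opcode tuples; this is illustrative only, not the PR's C# code:

```python
def peephole(ops):
    """One forward pass, matching against the last emitted opcode."""
    out = []
    for op in ops:
        # not ∘ not = identity: cancel the pair.
        if op == ("not",) and out and out[-1] == ("not",):
            out.pop()
            continue
        # not->branch[true|false] becomes branch[false|true].
        if op[0] in ("branch_true", "branch_false") and out and out[-1] == ("not",):
            out.pop()
            flipped = "branch_false" if op[0] == "branch_true" else "branch_true"
            out.append((flipped, op[1]))
            continue
        out.append(op)
    return out
```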

Scope Simplification
Any scope that does not define any variables is eliminated. This saves 2 opcodes for each scope that is collapsed.
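A sketch of the scope-collapsing idea, assuming hypothetical `pushscope`/`popscope`/`storelocal` opcode tuples (the real opcodes and scope bookkeeping differ):

```python
def simplify_scopes(ops):
    """Delete pushscope/popscope pairs whose body declares no locals,
    saving 2 opcodes per collapsed scope. Repeats until stable so that
    newly emptied outer scopes collapse too."""
    ops = list(ops)
    changed = True
    while changed:
        changed = False
        for i, op in enumerate(ops):
            if op[0] != "pushscope":
                continue
            depth, j = 1, i + 1          # find the matching popscope
            while j < len(ops) and depth:
                if ops[j][0] == "pushscope":
                    depth += 1
                elif ops[j][0] == "popscope":
                    depth -= 1
                j += 1
            body = ops[i + 1:j - 1]
            if depth == 0 and not any(o[0] == "storelocal" for o in body):
                del ops[j - 1]           # remove popscope first
                del ops[i]               # then the pushscope
                changed = True
                break
    return ops
```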

Next Steps

  • More testing! While I've added some unit tests, which run correctly, and I've reviewed the opcode output from those tests to see that the optimization passes are doing what they are supposed to, unit testing only covers simple cases. I need to compile larger and more complicated programs and see what breaks.
  • Further optimization passes. The OptimizationLevel.cs file includes the roadmap for future passes that could be implemented. So far only the O1 (Minimal) level is finished.
  • Adding optimization level specification to the compile function. Unit testing has direct access to the Compile Options fed to the compiler. We'll need to add support for in-game compilation level selection. I think I've got it hardcoded to use no optimizations in the interpreter (prioritizing responsiveness over performance) and Minimal for general compilation (running or compiling programs), but this is only a development hack.

DBooots added 30 commits March 1, 2026 22:53
…OptimizationLevel enum to select level of optimization; BaseIntegrationTest uses no optimization. Default the interpreter to no optimization since it won't be worth it there.
…IRBuilder generates basic blocks, which IREmitter can then convert back to kRISC opcodes. This initial implementation does not accomplish any optimization yet (the opposite sometimes) but all current tests pass on the outputted opcodes.
…nalities with IRVariable but not be a subclass.
…or logical operation. This will be used for constant folding during compile. Hopefully the C# compiler will inline these back into the original Opcode methods, but the impact to execution should be minimal either way.
…heir most reduced form. Invalid operations found during folding throw a compilation exception, which should report the problematic line and column in the source.

Also add tests to confirm correct functionality (and improve current test suite for equivalent non-optimizing tests).
…d adding a virtual 'interim CPU' to accomplish the function call through the FunctionManager in order to avoid rewriting the function call logic.
…ng the AssemblyWalk attribute to discover all classes implementing IOptimizationPass.
…n be aliased with the alias (saves one opcode per access). This also replaces all constant() or constant: suffix accesses with the constant value, and does so in time for the constant folding pass.
…ally) every emitted opcode will have a location that makes sense.
…entation for whole-program optimizations (e.g. function inlining).
…d function and lock labels. Fix fallthrough jumps happening at the wrong time.
…optimizing of function code fragments. Add inspection methods to UserFunction and UserFunctionCollection to intercept the code fragments during the optimization pipeline.
…re something would be popped from the stack, but a BasicBlock's stack is empty at that point.
…of pass sort indices to give more space for expansion.
…. Note that scopes are not fully implemented yet.
…that 0/0=1 or 0/X=0 for all X (including 0) instead of throwing an exception. But that's already undefined behaviour so this should be acceptable.
…ter passes to do targeted constant folding after making a change.
DBooots added 23 commits March 11, 2026 16:41
…n. This should be used carefully for in-place restructuring of IR instructions.
…truction tree more concise and easier to write.
…ons are equal if their operation and operands are equal. Temporary values are equal if the sequence of operations to return them are equal.
…m. This fixes programs with parameters breaking when they check for an argument marker.
…izations, as well as larger algebraic simplifications. See the list in OptimizationLevel.cs and the complete set of algebraic simplifications in PeepholeOptimizations.cs. Includes a unit test for these operations.
…e confused with class Scope or class VariableScope...
…es scope pops that can be replaced by incrementing the return depth.
…ompared not just by their string name, but also by their scope.
…xceptions when storing a variable defined globally in another scope.
…h constants wherever it can. Variables that are global, or that are written to within a LOCK statement are left alone, and variables that are written to in a function are not replaced after the first call to that function, unless subsequently reset. This works across the reaching definition, although the current implementation is a bit messy.
DBooots changed the title from Compiler Optimization to [DEMO] Compiler Optimization on Mar 14, 2026
@nuggreat

I don't think algebraic simplification can work without strong typing, which kOS doesn't have. While it should work for any scalar operations and most vector operations, it likely breaks down once directions are in the equations. A basic example where it could break down for a vector operation is x*x*x when x is a vector, since x^3 is not a valid operation on a vector. And the example below is where it would give the wrong answer for combined vector and direction operations:

LOCAL a IS v(1,0,0).
LOCAL b IS r(0,90,0).
LOCAL c IS r(0,0,90).

PRINT c * a + b * a.  //should be about v(0,1,-1)
PRINT a * (c + b).    //algebraically expected to be about the same as above but is actually about v(0,1,0)

I have not compiled your branch with the optimizer to check whether these actually do what I think they will, or whether I would need to construct more elaborate cases. If you can successfully do type inference these cases can be caught, but I believe that is not easy to do.


DBooots commented Mar 14, 2026

> I don't think algebraic simplification can work without strong typing, which kOS doesn't have. While it should work for any scalar operations and most vector operations, it likely breaks down once directions are in the equations. A basic example where it could break down for a vector operation is x*x*x when x is a vector, since x^3 is not a valid operation on a vector. And the example below is where it would give the wrong answer for combined vector and direction operations:
>
> LOCAL a IS v(1,0,0).
> LOCAL b IS r(0,90,0).
> LOCAL c IS r(0,0,90).
>
> PRINT c * a + b * a.  //should be about v(0,1,-1)
> PRINT a * (c + b).    //algebraically expected to be about the same as above but is actually about v(0,1,0)
>
> I have not compiled your branch with the optimizer to check whether these actually do what I think they will, or whether I would need to construct more elaborate cases. If you can successfully do type inference these cases can be caught, but I believe that is not easy to do.

Oh shoot, you're absolutely right. The unit test environment doesn't work with non-scalars, so I had gotten complacent and forgotten about the other types where add and mul are valid operators but do different things. I had considered adding a type inference system, which, as you point out, will be needed for algebraic reduction. That will come with formalizing an SSA form and rewriting the reaching-definitions analysis used in the constant propagation pass, which is currently quite messy.
Another next step should also be to extend the kOS encapsulated types to work in the unit test environment so that the type inference system can actually be verified.


DBooots commented Mar 16, 2026

Regarding type inference, I think the easiest option is to leverage the strongly-typed environment in which the code is being compiled. We would extend FunctionAttribute to add a (possibly optional) return type. While I'm at it, I would also add a property to indicate to the compiler that the method does not depend on game state and can be called at compile time if the inputs are immutable. For suffix accesses, if the incoming object type is known, the generic type argument of a given suffix can easily be found through reflection (perhaps cached in Structure.AddSuffix for snappier compiling). Those two easy-to-implement things should cover most of the work.
The harder part will be handling the Calculator classes. I think I would add a second method to Calculator to overload GetCalculator to work with two Types. I would also add abstract methods for the operations, taking two Types as operands. In implementation, these would mirror the structure of their 'real' methods and return the Type of the object it would expect to return.
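To make the Calculator idea concrete, here is a rough Python sketch of what the compile-time half might look like: a dispatch table keyed on operand types that mirrors the runtime operator overloads, so the optimizer can tell scalar `*` from the vector and direction forms. All type names and the table contents are illustrative assumptions, not the PR's design:

```python
# Hypothetical compile-time result-type table for '*', mirroring the
# runtime Calculator dispatch. Entries here are illustrative.
MUL_RESULT = {
    ("Scalar", "Scalar"): "Scalar",
    ("Vector", "Scalar"): "Vector",
    ("Scalar", "Vector"): "Vector",
    ("Vector", "Vector"): "Scalar",      # dot product
    ("Direction", "Vector"): "Vector",   # rotation applied to a vector
}

def infer_mul(left, right):
    """Return the inferred result type, or None when unknown.
    An unknown type means the optimizer must leave the expression alone."""
    return MUL_RESULT.get((left, right))

def mul_is_commutative(left, right):
    """Algebraic rewrites like A*B -> B*A or factoring A*B+A*C are only
    safe when both operands are scalars, per nuggreat's counterexample."""
    return left == right == "Scalar"
```

The key point the sketch captures is nuggreat's: simplifications must be gated on inferred types, and any `None` (unknown) result has to disable the rewrite entirely.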
Also, I take back my criticism about the unit test environment not working for Vectors and Directions. I see now that that's to avoid invoking anything that touches the UnityEngine, which doesn't work when the game isn't running.
